VMIproc Runs 81-87

27/11/20, updated 01/12/20 with masking + renorm fix PH

Data analysis for 450/440eV pump-probe, 1$\mu$m focus (I think) - should be comparable to previous analysis posted by TMO team (Taran...?), running from v4 preprocessed data.

Using VMIproc class for VMI data analysis & interactive plotting, including uncertainties.

Implemented:

  • Plotting
  • Inversion
  • Uncertainties (bootstrap protocol)
  • Masking

To do:

  • Covariance

Background:

In [1]:
# # Memory profiler - OPTIONAL for testing
# # https://timothymonteath.com/articles/monitoring_memory_usage/
# %load_ext memory_profiler
# %memit


# # Quick hack for slow internet connection!
# %autosave 36000
In [2]:
# Standard imports
from pathlib import Path
import sys

import numpy as np
import xarray as xr

# Import class from file
sys.path.append(r'/cds/home/p/phockett/dev/tmo-dev/tmo')
import tmoDataBase as tb
# from vmi import VMI  # VMI class
from inversion import VMIproc  # VMI processing class

tb.setPlotDefaults()  # Set plot defaults (optional)

Load data

Set dataset

In [3]:
# Set main data path
baseDir = Path('/reg/data/ana15/tmo/tmolw0618/scratch/preproc/v5')

# Setup class object and which runs to read.
data = VMIproc(fileBase = baseDir, runList = range(81,87+1), fileSchema='_v5')

# The class has a bunch of dictionaries holding info on the data
# data.runs['files']

# Read in the files with h5py
# There is very basic checking implemented currently, just to see if 'energies' is present.
data.readFiles()
Read 7 files.
Good datasets: [81, 82, 83, 84, 85, 86, 87]
Invalid datasets: []

Filtering

In [4]:
# Add a filter to an existing filterset
# Set energy filter to match previous analysis
# TODO: implement event count filter here to check for saturation
# data.setFilter({'signal':{'energies':[0.05, 0.1]}})
data.setFilter(reset=True)

data.setFilter({'signal':{'evrs':[1,1,70]}})  # For v4 data, gas == evrs[:,70] true/false
data.setFilter({'bg':{'evrs':[0,0,70]}})
data.setFilter({'energies':[0.05, 0.1]})
data.filters
Out[4]:
{'signal': {'evrs': [1, 1, 70], 'energies': [0.05, 0.1]},
 'bg': {'evrs': [0, 0, 70], 'energies': [0.05, 0.1]}}

Generate VMI images

Note this defaults to electron coords ('xc','yc').

In [5]:
data.genVMIXmulti()
Generating VMI images for filters: signal
Generating VMI images for filters: bg
In [6]:
# Check some metrics
print(data.data[81]['signal']['metrics'])
print(data.data[81]['bg']['metrics'])
{'shots': (73658, 2000), 'selected': 63130, 'events': 126260000, 'norm': 73658}
{'shots': (73658, 2000), 'selected': 10528, 'events': 21056000, 'norm': 73658}
In [7]:
# Check an image with Xarray plotter (Matplotlib)
data.imgStack['signal'].sel(run=81).pipe(np.log10).plot.imshow()
/cds/home/p/phockett/.conda/envs/ps-4.0.8-local-clone/lib/python3.7/site-packages/xarray/core/computation.py:700: RuntimeWarning: divide by zero encountered in log10
  result_data = func(*input_data)
Out[7]:
<matplotlib.image.AxesImage at 0x7f7f7e54bfd0>
In [8]:
# Subtraction, and set as new image set
data.imgStack['sub'] = data.imgStack['signal'] - data.imgStack['bg']
data.imgStack
Out[8]:
<xarray.Dataset>
Dimensions:  (run: 7, xc: 1048, yc: 1048)
Coordinates:
  * xc       (xc) float64 -0.5 0.5 1.5 2.5 ... 1.044e+03 1.046e+03 1.046e+03
  * yc       (yc) float64 -0.5 0.5 1.5 2.5 ... 1.044e+03 1.046e+03 1.046e+03
  * run      (run) int64 81 82 83 84 85 86 87
    norm     (run) int64 73658 75402 72980 74351 73406 92623 79525
Data variables:
    signal   (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    bg       (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    sub      (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
In [9]:
# View full image stack
data.showImgSet()
WARNING:param.RasterPlot16226: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

Set centre point

This is set as native pixel value, or assumed to be the max point in the image.

In [10]:
# data.setCentre([553, 546])  # Set explictly
data.setCentre()  # Default case - set as max ind.
data.checkCentre()
WARNING:param.OverlayPlot47307: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.
In [11]:
# All the coord parameters are set in `centreInds`
# Note here that `cList` gives the array indicies for the centre point, based on the pixel coords set.
data.centreInds
Out[11]:
{'input': [552.5, 547.5],
 'dims': ['yc', 'xc'],
 'yc': {'coord': array(552.5), 'index': array([553])},
 'xc': {'coord': array(547.5), 'index': array([548])},
 'cList': [553, 548],
 'dMax': [494.0, 499.0]}

Basic processing

See the previous demo for info, here we'll just generate subtracted images.

In [12]:
# Subtraction, and set as new image set
data.imgStack['sub'] = data.imgStack['signal'] - data.imgStack['bg']
data.imgStack
Out[12]:
<xarray.Dataset>
Dimensions:  (run: 7, xc: 1048, yc: 1048)
Coordinates:
  * xc       (xc) float64 -0.5 0.5 1.5 2.5 ... 1.044e+03 1.046e+03 1.046e+03
  * yc       (yc) float64 -0.5 0.5 1.5 2.5 ... 1.044e+03 1.046e+03 1.046e+03
  * run      (run) int64 81 82 83 84 85 86 87
    norm     (run) int64 73658 75402 72980 74351 73406 92623 79525
Data variables:
    signal   (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    bg       (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
    sub      (xc, yc, run) float64 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0

Image inversion

Multiple methods to be supported... so far only Elio's cpBasex is implemented, and this needs a bit of setup.

In [13]:
# For cpbasex, set the basis path
basisPath = r'/reg/d/psdm/tmo/tmolw0618/scratch/results/tmo_ana/calc/G_r512_k128_l4.h5'

# Set also local version - pip installed version not working with loadG in testing (version mismatch issue, or something in import routine?)
pbasexPath = r'/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex' #/pbasex'

# Import module + basis functions
data.setMethod(pbasexPath = pbasexPath, basisPath = basisPath)
cpBasex import OK, checking for basis functions...
Found basis at /reg/d/psdm/tmo/tmolw0618/scratch/results/tmo_ana/calc/G_r512_k128_l4.h5.

Invert a set of images by setting filterSet (if not set, all images in self.imgStack will be processed). Optionally, set normalisation parameters for the images, the default is norm={'type':'max', 'scope':'global'}, which normalises to the global maximum (i.e. over all images).

Currently the options are:

type:

  • max
  • sum
  • raw

scope:

  • global
  • local
In [14]:
# Invert a dataset
filterSet = 'sub'
data.inv(filterSet=filterSet)
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
In [15]:
# Full results are stored in a dictionay, self.proc[filterSet]
# filterSet = 'signal'
data.proc[filterSet].keys()
Out[15]:
dict_keys(['fold', 'pbasex', 'xr'])
In [16]:
# ... where 'xr' contains the results in an Xarray
data.proc[filterSet]['xr']
Out[16]:
<xarray.DataArray 'IE' (BLM: 3, E: 512, run: 7)>
array([[[ 0.        ,  0.        ,  0.        , ...,  0.        ,
          0.        ,  0.        ],
        [ 0.85542143,  1.18625065,  1.35658582, ...,  1.86054249,
          1.87010694,  2.26394807],
        [ 1.584021  ,  2.1825303 ,  2.49086432, ...,  3.39562632,
          3.40398422,  4.11860754],
        ...,
        [ 0.02192393,  0.02763356,  0.02935417, ...,  0.04305293,
          0.0492009 ,  0.06955178],
        [ 0.02293437,  0.02926899,  0.0306888 , ...,  0.04503984,
          0.05134804,  0.07291196],
        [ 0.02080901,  0.02674488,  0.02783416, ...,  0.04086939,
          0.04652817,  0.06624388]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [-0.715888  , -0.77826999, -0.82973024, ..., -0.89410312,
         -0.97330106, -0.89292027],
        [-0.7049096 , -0.77057544, -0.82194976, ..., -0.89197449,
         -0.97315639, -0.89267838],
...
        [ 0.66048202,  0.73289361,  0.76940053, ...,  0.80619312,
          0.82506544,  0.85819444],
        [ 0.82225295,  0.8468029 ,  0.93491667, ...,  0.96171003,
          0.98558588,  1.00780313],
        [ 0.91236225,  0.90757257,  1.02694784, ...,  1.04782472,
          1.07461533,  1.08992386]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [ 0.42862268,  0.44782799,  0.57983295, ...,  0.62996665,
          0.75070599,  0.6628786 ],
        [ 0.43527086,  0.45635701,  0.58769118, ...,  0.63959001,
          0.76076219,  0.67379061],
        ...,
        [-0.39927194, -0.42682374, -0.37422606, ..., -0.42094869,
         -0.41788483, -0.42413014],
        [-0.54264574, -0.54728369, -0.52934539, ..., -0.56243639,
         -0.53253836, -0.56463216],
        [-0.6223992 , -0.61205552, -0.61605184, ..., -0.6409392 ,
         -0.59555741, -0.64212169]]])
Coordinates:
  * BLM      (BLM) int64 0 2 4
  * E        (E) float64 0.0 0.000359 0.001436 0.003231 ... 93.01 93.38 93.74
  * run      (run) int64 81 82 83 84 85 86 87
    XS       (E, run) float64 0.0 0.0 0.0 0.0 0.0 ... 2.239 2.707 3.082 4.388
    pixel    (E) int64 0 1 2 3 4 5 6 7 8 ... 503 504 505 506 507 508 509 510 511
    mask     (E, run) bool False False False False False ... True True True True
    norm     (run) float64 38.14 53.41 58.2 62.76 66.24 54.13 53.62
Attributes:
    normType:  {'type': 'max', 'scope': 'global'}
In [17]:
# For rapid plotting per run, use self.plotSpectra()
# Note this defaults to the full data range, aside from a small central mask by default
data.plotSpectra(filterSet = filterSet)
In [18]:
# Set an energy mask and replot
data.setRmask(filterSet, eRange = [0.5,50])

# This can also be set to return a HoloMap for more flexibility
hmap = data.plotSpectra(returnMap = True, filterSet = filterSet)
hmap.overlay('BLM').layout('run').cols(1)
Out[18]:
In [19]:
# Here we'll also renormalise to the feature over ~25-30eV

data.renorm(norm={'type':'max', 'scope':'run'}, filterSet=filterSet, eRange=[25,30])

hmap = data.plotSpectra(returnMap = True, filterSet = filterSet, useMask=False, ePlot=[0.1,50])
hmap.overlay('run').layout('BLM').cols(1)
Out[19]:

Inversion details

Various other stages and results are also logged to the data strucure...

Images are set to 'filterSet-TYPE', where TYPE is:

  • 'fold': Quadrant filter result if quadrant filter is set. These are useful for checking centre point is OK. (TODO: implement quadrant.unfoldQuadrant for full symmetrized images.)
  • 'fit': Reconstructed/fitted images.
  • 'inv': Inverted images

These can be accessed and plotted with the usual functionality.

NOTE: there is currently no default radial masking set, so the default plot scaling will likely be off - adjust with the histogram.

In [21]:
# Quad filter 
data.showImg(name=filterSet + '-fold')
WARNING:param.RasterPlot58937: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.
In [23]:
data.showImg(name=filterSet + '-inv')
WARNING:param.RasterPlot59637: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.
In [32]:
# Raw cpbasex output data
data.proc['sub']['pbasex'].keys()
Out[32]:
dict_keys(['E', 'IE', 'betas', 'c', 'fit', 'inv'])

Uncertainties

A basis Poissonian bootstrapping routine is implemented for estimating uncertainties. This operates on the event data, and allows error propagation through the image generation & analysis, but is (consequently) a little slow. (For more on the method employed, see the wikipedeia page#Poisson_bootstrap).)

Note this routine is a little crude at the moment! It will run image generation for all filterSets, run image subtraction, then image inversion & stats for all filterSets, or just the listed element.

In [25]:
# Run bootstrapping & inversion routine, 5 iterations, and calculate stats for the subtracted data only.
data.bootstrapInv(N = 5, filterSet = 'sub')
*** Bootstrap run for N=5, lambda=1.0, filterSet=sub
Running set 1 of 5
Generating VMI images for filters: signal
Generating VMI images for filters: bg
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
Running set 2 of 5
Generating VMI images for filters: signal
Generating VMI images for filters: bg
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
Running set 3 of 5
Generating VMI images for filters: signal
Generating VMI images for filters: bg
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
Running set 4 of 5
Generating VMI images for filters: signal
Generating VMI images for filters: bg
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
Running set 5 of 5
Generating VMI images for filters: signal
Generating VMI images for filters: bg
/reg/d/psdm/tmo/tmolw0618/results/modules/pbasex/pbasex.py:31: RuntimeWarning: invalid value encountered in true_divide
  betas = IEB[:,nim:].reshape(nx,nl-1,nim)/IE[:,None,:]
In [26]:
# Stats are output to a dataset, with variables per iteration
data.stats
Out[26]:
<xarray.Dataset>
Dimensions:  (BLM: 3, E: 512, run: 7)
Coordinates:
  * BLM      (BLM) int64 0 2 4
  * E        (E) float64 0.0 0.000359 0.001436 0.003231 ... 93.01 93.38 93.74
  * run      (run) int64 81 82 83 84 85 86 87
    XS       (E, run) float64 0.0 0.0 0.0 0.0 0.0 ... 2.271 2.727 3.085 4.376
    pixel    (E) int64 0 1 2 3 4 5 6 7 8 ... 503 504 505 506 507 508 509 510 511
    mask     (E, run) bool False False False False False ... True True True True
    norm     (run) float64 38.63 54.49 57.52 63.13 63.4 52.59 53.09
Data variables:
    0        (BLM, E, run) float64 0.0 0.0 0.0 0.0 ... -0.6433 -0.6386 -0.6494
    1        (BLM, E, run) float64 0.0 0.0 0.0 0.0 ... -0.659 -0.5828 -0.6737
    2        (BLM, E, run) float64 0.0 0.0 0.0 0.0 ... -0.6514 -0.6248 -0.6592
    3        (BLM, E, run) float64 0.0 0.0 0.0 0.0 ... -0.6295 -0.6071 -0.6531
    4        (BLM, E, run) float64 0.0 0.0 0.0 0.0 ... -0.6353 -0.5913 -0.6556
In [27]:
# Similarly to plotSpectra(), there is a plotUncertainties() function for this data
data.plotUncertainties()
In [28]:
# As before, an object can be returned for further plotting options
hmap = data.plotUncertainties(returnMap=True)
hmap.overlay('BLM') #.cols(1)
Out[28]:
In [29]:
hmap.overlay('run').layout('BLM').cols(1)
Out[29]:

Versions

In [30]:
# tmo-dev class version
data.__version__
Out[30]:
'0.0.1'
In [31]:
import scooby
scooby.Report(additional=['holoviews', 'xarray'])
Out[31]:
Tue Dec 01 18:17:24 2020 PST
OS Linux CPU(s) 16 Machine x86_64
Architecture 64bit RAM 125.6 GB Environment Jupyter
Python 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) [GCC 7.5.0]
holoviews 1.13.5 xarray 0.16.1 numpy 1.19.4
scipy 1.5.3 IPython 7.19.0 matplotlib 3.3.2
scooby 0.5.6